Adaptable Fault Tolerance Configurations for Multiprocessor Systems
Abstract
The growing complexity of multiprocessor systems increases the probability that faults occur in them. As a consequence, there is a great need for fault-tolerant processing in multiprocessor systems. Fault tolerance generally requires some form of hardware and/or time redundancy. Two fault-tolerant configurations are proposed that handle both single and double faults, transient and permanent, in any processor of a multiprocessor system. Fault tolerance takes place in three consecutive steps: fault detection, fault diagnosis, and system recovery. The overhead of the first configuration is 100% hardware for fault detection; the overhead of the second is 100% time for fault detection. In both, an extra 100% time is spent on fault diagnosis and system recovery, but only for the processes running on the faulty processors. The advantages of the proposed configurations are their ease of application and their low overhead relative to a system without any fault tolerance. An enhancement is developed for both configurations that checks the system state adequately, so that faults are detected and recovered from as soon as they affect the system. Simulations are performed to illustrate the usefulness of the proposed configurations.
General Terms: Fault Tolerance, Multiprocessor Systems.
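The three steps named in the abstract (detection, diagnosis, recovery) can be illustrated with a minimal sketch. The abstract does not say how each step is implemented, so the mechanism here is an assumption: detection by running the task twice and comparing the results, diagnosis by one extra run that is paid for only when the first two disagree, and recovery by returning the majority result. All names (`healthy`, `faulty`, `execute`) are hypothetical, and processors are modeled as plain Python functions.

```python
from typing import Callable, Tuple

Task = Callable[[int], int]
Processor = Callable[[Task, int], int]


def healthy(task: Task, x: int) -> int:
    """Hypothetical fault-free processor: runs the task unchanged."""
    return task(x)


def faulty(task: Task, x: int) -> int:
    """Hypothetical faulty processor: flips the lowest bit of the result."""
    return task(x) ^ 1


def execute(task: Task, x: int, primary: Processor,
            checker: Processor, arbiter: Processor) -> Tuple[int, str]:
    # Step 1, detection (assumed mechanism): run the task twice and compare.
    a = primary(task, x)
    b = checker(task, x)
    if a == b:
        return a, "no fault detected"

    # Step 2, diagnosis: one extra run, needed only after a mismatch.
    c = arbiter(task, x)

    # Step 3, recovery: return the result that two of the three runs agree on.
    if c == a:
        return a, "checker faulty"
    if c == b:
        return b, "primary faulty"
    raise RuntimeError("all three results differ; cannot diagnose")


def square(x: int) -> int:
    return x * x


if __name__ == "__main__":
    print(execute(square, 7, healthy, healthy, healthy))
    print(execute(square, 7, faulty, healthy, healthy))
```

In this sketch the duplicate run corresponds to the 100% detection overhead (a second processor in a hardware-redundant setup, or a second pass on the same processor in a time-redundant one), and the arbiter run corresponds to the extra time spent only on processes that hit a fault.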
Similar resources
Fault Tolerance for Multiprocessor Systems Via Time Redundant Task Scheduling
Fault tolerance is often considered as a good additional feature for multiprocessor systems but nowadays it is becoming an essential attribute. Fault tolerance can be achieved by the use of dedicated customized hardware that may have the disadvantage of large cost. Another approach to fault tolerance is to exploit existing redundancy in multiprocessor systems via a task scheduling software stra...
Analysis of Selective Fault-Tolerant, Hard Real-Time
An increasing number of applications are demanding real-time performance from their multiprocessor systems. For many of these applications, a failure may produce disastrous results. Such failures are avoided in hard real-time systems by the use of fault-tolerance. In hard real-time multiprocessor scheduling, this fault tolerance may be provided by including several task backups in each schedule...
Software fault-tolerance in the Pluribus
Over the past decade, the decreasing cost of minicomputer components has encouraged the use of multiprocessor techniques in the design of high-speed, cost-effective computer systems. Multiprocessor architectures have two principal advantages over conventional single-processor designs. First, a multiprocessor system can achieve greater computational speed through parallelism in its task structur...
Analysis of a Fault-Tolerant Multiprocessor Scheduling Algorithm
Fault tolerance is an important aspect of real-time computer systems, since timing constraints must not be violated. When dealing with multiprocessor systems, fault tolerance becomes an even greater requirement, since there are more components that can fail. In this paper, we present the analysis of a fault-tolerant scheduling algorithm for real-time applications on multiprocessors. Our algorith...